A robust BFCC feature extraction for ASR system
Abstract
An auditory-based feature extraction algorithm, named the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC), is proposed to increase the robustness of automatic speech recognition. Whereas the Mel-Frequency Cepstral Coefficient (MFCC) method is based on a Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform, which simulates the auditory response of the human inner ear, to improve noise immunity. A Hidden Markov Model (HMM) recognizer is used to evaluate the proposed BFCC, with training and testing conducted on the AURORA-2 corpus using datasets at different signal-to-noise ratios (SNRs). The experimental results indicate that, compared with MFCC, Gammatone Wavelet Cepstral Coefficient (GWCC), and Gammatone Frequency Cepstral Coefficient (GFCC), the proposed BFCC improves the speech recognition rate by 13%, 17%, and 0.5%, respectively, on average for speech samples with SNRs ranging from -5 to 20 dB.
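To make the gammachirp idea concrete, the following is a minimal sketch of a single gammachirp impulse response, using the commonly cited form t^(n-1) · exp(-2π·b·ERB(fc)·t) · cos(2π·fc·t + c·ln t + φ) with the Glasberg-Moore ERB approximation. It is not the authors' implementation: the function names `erb_hz` and `gammachirp`, the 8 kHz sample rate, the 25 ms duration, and the values of `order`, `b`, and `c` are all illustrative assumptions, since the abstract does not state the filter parameters.

```python
import math


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth (Glasberg-Moore approximation), in Hz."""
    return 24.7 + 0.108 * fc_hz


def gammachirp(fc_hz, sample_rate_hz=8000, duration_s=0.025,
               order=4, b=1.019, c=-2.0, phase=0.0):
    """Return a peak-normalized gammachirp impulse response as a list of floats.

    All default parameter values are assumptions for illustration only;
    the abstract does not specify the ones used for BFCC.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    bandwidth = b * erb_hz(fc_hz)
    response = []
    # Start at k = 1 so that ln(t) in the chirp term is defined.
    for k in range(1, n_samples + 1):
        t = k / sample_rate_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
        carrier = math.cos(2.0 * math.pi * fc_hz * t + c * math.log(t) + phase)
        response.append(envelope * carrier)
    peak = max(abs(v) for v in response)
    return [v / peak for v in response]


if __name__ == "__main__":
    h = gammachirp(1000.0)
    print(len(h), "samples, ERB at 1 kHz =", erb_hz(1000.0), "Hz")
```

In a full front end, a bank of such filters at different centre frequencies would presumably be applied to the signal to form the auditory spectrogram before the cepstral step; that part is omitted here because the abstract gives no details.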
Similar resources
Speech Recognition in Noisy Environment Using Different Feature Extraction Techniques
In this paper, different feature extraction methods for speech recognition systems, such as Mel-frequency cepstral coefficients (MFCC), linear predictive coefficient cepstrum (LPCC) and Bark frequency cepstral coefficients (BFCC), are implemented and compared on the basis of average recognition accuracy. We suggest a noise robust isolated word speech recognition system which can be applied i...
Why do ASR Systems Despite Neural Nets Still Depend on Robust Features
To what extent can neural nets learn traditional signal processing stages of current robust ASR front-ends? Will neural nets replace the classical, often auditory-inspired feature extraction in the near future? To answer these questions, a DNN-based ASR system was trained and tested on the Aurora4 robust ASR task using various (intermediate) processing stages. Additionally, the training set wa...
Missing Feature Imputation of Log-spectral Data for Noise Robust ASR
In this paper, we present a missing feature (MF) imputation algorithm for log-spectral data with applications to noise robust ASR. Drawing from previous work [1], we adapt the previously proposed spectrographic reconstruction solution to the liftered log-spectral domain by introducing log-spectral flooring (LS-FLR). LS-FLR is shown to be an efficient and effective noise robust feature extractio...
Pitch-Synchronous Peak-Amplitude (PS-PA)-Based Feature Extraction Method for Noise-Robust ASR
A novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR) is proposed. A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously and it showed improved performances except when modulation enhancement was integrated with Wiener filter (WF)-based noise reduction and auditory masking. Ho...
Speech Representation Learning Using Unsupervised Data-Driven Modulation Filtering for Robust ASR
The performance of an automatic speech recognition (ASR) system degrades severely in noisy and reverberant environments in part due to the lack of robustness in the underlying representations used in the ASR system. On the other hand, the auditory processing studies have shown the importance of modulation filtered spectrogram representations in robust human speech recognition. Inspired by these...
Journal title:
- Artif. Intell. Research
Volume 5, Issue
Pages -
Publication date 2016